FEATURE EXTRACTION


INTRODUCTION 


Since  FY80,  semiautomated  feature  extraction  has  been  studied  at  the 
Computer  Sciences  Laboratory  (CSL),  Engineer  Topographic  Laboratories  (ETL), 
as  part  of  the  5-year  Army  Feature  Extraction  Plan.  The  goal  of  this  effort 
is  to  provide  the  Defense  Mapping  Agency  (DMA)  with  a  digital  semiautomated 
feature extraction system that can extract Mapping, Charting, and Geodesy
(MC&G) features in DMA's production environment. Thus far, the problem has
been  simplified  to  consider  only  the  easiest  of  cartographic  features,  such  as 
buildings,  roads,  forests,  fields,  and  lakes.  A  variety  of  approaches  have 
included  the  study  of  statistical  classification,  postprocessing  (relaxation 
and  binary  image  cleansing),  lineal  detectors,  and  stereo  correlation.  This 
report  describes  the  recent  development  and  evaluation  of  classification 
techniques  for  feature  extraction. 

Two  efforts  have  contributed  significantly  to  the  study  of  statistical 
classification  techniques  at  CSL.  One  of  these  efforts  was  a  series  of 
experiments  on  texture  and  image  segmentation,  done  in-house  and  discussed  in 
earlier  ETL  research  notes.*  This  work  studied  the  feasibility  of  using 
various image descriptors (Max-Min texture, edge texture measures, an
Ad-Hoc measure, and Laws texture) in a supervised classification algorithm
to  identify  cartographic  features.  Data  reduction  on  the  descriptors  was 
attempted  using  the  divergence  measure  and  principal  components.  In  addition, 
attempts were made to reduce misclassification by relaxation and
raster-processing techniques.

The  second  effort  was  the  development  of  feature  extraction  software  on 
the  Digital  Image  Analysis  Laboratory  (DIAL),  done  under  contract  with  IBM. 

The  software  was  implemented  after  an  initial  task  to  survey  the  available 
feature  extraction  techniques  was  completed.  Essentially,  two  methods  of 
classification  were  selected,  a  supervised  method  using  the  Maximum  Likelihood 
(Bayes)  algorithm  and  an  unsupervised  (Clustering)  method  based  on  the  ISODATA 
algorithm.  These  methods  and  a  number  of  associated  support  and  evaluation 
functions  were  developed  and  implemented  on  DIAL.  The  resulting  interactive 
system  is  a  research  tool  invoking  a  sophisticated  work  station  and  color 
display  capability.  The  system  is  designed  to  handle  up  to  24  channels  of 


*See the three reports written by Crombie, Rand, and Friend in the
bibliography,  and  a  fourth  report  written  by  Rand  and  Shine. 




input  data,  which  can  come  from  a  variety  of  sources  including  panchromatic 
and  infrared  imagery,  LANDSAT  imagery,  and  texture  data.  Documentation  of  the 
system and its use was published in two volumes.1,2

Cleansing methods, which are applied to the output of almost any
classification process, have undergone significant development on DIAL. Two
approaches have been tested, namely probabilistic relaxation and raster
processing. The routine for probabilistic relaxation is available on DIAL;
however,  results  thus  far  have  been  discouraging  because  of  the  many 
iterations  required  to  achieve  significant  improvements  over  an  initial 
classification.  Raster  processing,  which  is  interfaced  with  STARAN's 
associative  array  processor  (and  DIAL),  has  been  very  successful  in  achieving 
good  results.  However,  the  drawback  to  this  technique  is  the  reason  for  its 
success;  raster  processing  is  a  very  interactive  procedure,  where  all 
decisions  are  made  by  the  user. 

This  research  note  updates  the  classification  procedure  currently 
available  on  DIAL  and  emphasizes  its  generality  in  solving  classification 
problems. The update is necessary because although DIAL's software package
has been available in the past, supporting routines that generate the image
descriptors of interest to CSL had not been written. DIAL was used to
classify LANDSAT imagery, whereas texture analysis and image segmentation
work was done offline. However, a number of image descriptors can now be
generated and utilized by DIAL. All the data manipulation and processing
capabilities available to LANDSAT MSS can now be accessed by other data types,
such  as  texture.  Both  the  supervised  maximum  likelihood  algorithm  and  the 
unsupervised  clustering  algorithm  can  be  used  in  a  highly  interactive  mode. 

Of  particular  advantage  is  the  capability  for  a  user  to  display,  as  an  image 
(or a supervised set of images), the data he plans to use. Up to three
channels  can  be  displayed  at  one  time  in  pseudocolor  and  used  during 
operations  such  as  the  definition  of  training  areas. 

Section  one  is  a  brief  synopsis  of  this  image  classification  procedure  on 
DIAL  with  a  special  application  toward  the  image  descriptors  generated  by  the 
program  TEXLAW.  Section  two  discusses  an  experiment  performed  using  the  DIAL 


1 W. Rice, J. Shipman, R. Spieler, Interactive Digital Image Processing
Investigation, prepared for U.S. Army Engineer Topographic Laboratories, Fort
Belvoir, VA, ETL-0172, December 1978, AD-A076 342.

2 W. Rice, J. Shipman, R. Spieler, Interactive Digital Image Processing
Investigation, Phase II, prepared for U.S. Army Engineer Topographic
Laboratories, Fort Belvoir, VA, ETL-0221, April 1980, AD-A087 518.

3 A. Rosenfeld, R. Hummel, S. Zucker, "Scene Labeling by Relaxation
Operations," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-6,
June 1976.

4 N. Friend, Analysis of Interactive Image Cleansing Via Raster-Processing
Techniques, U.S. Army Engineer Topographic Laboratories, Fort Belvoir, VA,
ETL-0347, November 1983, AD-A141 772.


software. The experiment tests various combinations of the Ad-Hoc and Laws
image descriptors, and it does so under a procedure that has eliminated some of
the  weaknesses  inherent  in  the  earlier  experiments.  For  example,  the  training 
regions  had  been  confined  to  rectangular  regions  and  thus  included  clutter 
that  was  not  representative  of  the  classes  being  defined.  The  data  were 
tested  on  two  independent  algorithms,  the  supervised  and  the  unsupervised 
classification  algorithms. 

Unfortunately,  the  results  of  this  experiment  as  well  as  those  of  all  the 
earlier experiments show that statistical classification methods, by
themselves, are insufficient to satisfy the requirements of a semiautomated
cartographic  feature  extraction  system.  The  only  hope  for  these  methods  is 
that  they  may  have  some  use  if  used  in  conjunction  with  other  techniques. 

Such  an  effort,  that  is,  to  coordinate  various  feature  extraction  techniques 
under  a  "rule-based"  system,  has  started  recently.  Initially,  methods  such  as 
edge and boundary detectors are likely to receive first consideration for
implementation under this rule-based system.


THE  CURRENT  CLASSIFICATION  PROCEDURE  USING  DIAL 


Introduction  to  the  DIAL  System.  The  Digital  Image  Analysis  Laboratory 
(DIAL)  is  an  interactive  system  that  has  been  used  at  ETL  to  research  a 
variety of mapping and photointerpretation techniques. Some of DIAL's
capabilities  include  gray  level  mapping,  magnification,  filtering  operations, 
mosaics,  warping,  scrolling,  targeting,  image  fusion,  stereo  matching  and 
compilation  of  elevation  data,  and  perspective  viewing.  Another  capability  of 
DIAL  is  the  feature  extraction  methodology  discussed  in  this  report. 

DIAL  consists  of  two  work  stations  connected  to  a  mainframe  computer 
system via a PDP 11/50 minicomputer. Each work station has a command station
(a Tektronix keyboard with display, two trackballs, and an x-y tablet) and
two color display screens. The Tektronix display in each work station is
linked  to  a  copying  unit,  and  one  of  the  color  displays  is  linked  to  a  DUNN 
631  color  camera  system.  The  two  work  stations  also  use  a  CYBER  170 
sequential  computer  and  have  access  to  a  STARAN  associative-array  processor. 
Peripheral  units  include  eight  disk  drives  and  four  magnetic  tape  units.  The 
system is outlined in figure 1.

The  system  software  on  DIAL  is  made  to  support  a  modular  program 
structure.  Typically,  a  programmer  wishing  to  perform  a  particular  task  codes 
his  algorithm  as  a  DIAL  program  module  (PM).  Users  of  DIAL  can  then  call  the 
module  from  one  of  the  work  stations.  The  PM's  are  called  individually,  with 
any  output  stored  on  a  DIAL  file  (or  a  set  of  DIAL  files).  Subsequent  calls 
to  this  PM  or  other  PM's  can  use  the  file  and  produce  other  files.  Thus,  a 
sequence  of  PM's  can  be  used  to  perform  a  number  of  small  tasks  that  build  on 
each  other,  resulting  in  the  completion  of  some  larger  task. 

Description  of  the  Classification  Procedure.  The  current  classification 
procedure  on  DIAL  can  be  divided  into  four  major  blocks:  data  preparation, 
data  modeling,  classification,  and  postprocessing.  In  general,  the  blocks  are 
performed  sequentially  with  the  option  to  repeat  and  add  more  data,  if 
necessary.  A  diagram  of  these  blocks  and  the  steps  within  each,  along  with 
the  supporting  software,  is  shown  in  figure  2. 

Data  Preparation.  In  the  data  preparation  block,  the  user  must 
acquire  or  generate  multichannel  data.  A  variety  of  digital-data  types  can  be 
used,  including  digitized  panchromatic  and  infrared  images,  LANDSAT  images, 
and  texture  data,  depending  on  the  application.  Up  to  24  channels  of  input 
can  be  accommodated,  and  using  a  combination  of  various  data  is  perfectly 
acceptable  as  long  as  the  resulting  channels  are  registered.  Each  channel  is 
a  plane  of  data  (such  as  a  component  to  some  texture  vector  or  a  band  from 
LANDSAT  imagery);  therefore,  if  the  available  data  are  stored  as  vectors,  the 
components  must  be  separated  into  planes.  The  only  system  requirement  for  the 
data  is  that  each  channel  be  stored  as  a  DIAL  image.  Thus,  if  the  data 
consist  of  nonintegral  numbers  —  such  as  is  the  case  with  many  texture 
measures  —  the  numbers  need  to  be  transformed  appropriately  to  be  integers  so 
that  the  data  can  be  made  into  an  image.  Once  the  channels  of  data  are 
available  as  registered  DIAL  images,  the  final  operation  in  this  block  is  to 
build  a  "composite"  image.  It  is  this  composite  image  that  is  used  in  the 
computations  involved  in  the  Data  Modeling  and  Classification  blocks. 

Figure 2. Diagram of classification procedure on DIAL


There  are  six  PM's  that  assist  the  data  preparation  process:  TEXLAW, 
MSSRFT,  RATIOF,  INTERL,  SAVE,  and  KLTRAN.  The  program  TEXLAW  generates 
various  image  descriptors  and  creates  a  DIAL  image  for  each  component  of  the 
descriptor  selected.  Descriptors  include  the  measures  previously  studied  by 
CSL, such as the Ad-Hoc measure5 and Laws texture measure.6 The program
MSSRFT is used when accessing LANDSAT MSS data to transform data residing on
"computer compatible tapes" (CCT's), thereby creating a set of four DIAL images
on disk. Neither TEXLAW nor MSSRFT operates as a DIAL PM; however, they serve
the function of preparing data for the remaining programs that do run on
DIAL. Therefore, they are included as part of DIAL's feature extraction
package.  The  program  RATIOF,  as  its  name  suggests,  ratios  one  DIAL  image  to 
another;  the  value  of  each  point  in  the  "numerator"  image  is  divided  by  the 
corresponding  point  in  the  "denominator"  image;  its  output  is  a  DIAL  image  of 
the  ratio.  The  program  INTERL  interleaves  interactively  selected  images, 
building  a  composite  image  of  up  to  24  channels  in  a  band-interleaved-by-pixel 
format.  SAVE  is  used  after  INTERL  and  RATIOF  to  save  an  image  permanently,  if 
desired;  it  is  also  used  after  the  program  KLTRAN. 
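
The data preparation steps described above can be sketched in a few lines.
This is a minimal illustration, not the DIAL code: the function names, the
linear rescaling to 256 levels, and the choice to return zero where a
denominator is zero are all assumptions made for the example. Planes are
plain lists of rows.

```python
def to_integer_image(plane, levels=256):
    """Linearly rescale a plane of real values to integers 0..levels-1.

    The linear mapping and the 256 levels are assumptions for illustration.
    """
    flat = [v for row in plane for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant plane
    return [[int(round((v - lo) / span * (levels - 1))) for v in row]
            for row in plane]


def ratio_image(numerator, denominator):
    """Point-by-point ratio of two registered planes.

    Returning 0.0 where the denominator is zero is an assumption.
    """
    return [[n / d if d else 0.0 for n, d in zip(nrow, drow)]
            for nrow, drow in zip(numerator, denominator)]


def interleave_by_pixel(planes):
    """Build a band-interleaved-by-pixel composite: one vector per pixel."""
    if len(planes) > 24:
        raise ValueError("at most 24 channels can be interleaved")
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[[p[r][c] for p in planes] for c in range(cols)]
            for r in range(rows)]
```

Each pixel of the composite is then a data vector with one component per
channel, which is the form used by the modeling and classification sketches
that follow in this note.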

The KLTRAN program enables the user to transform a composite image
consisting of N channels (N ≤ 24) into a set of N principal-component
images, based on a covariance matrix derived from the composite in the CLASTAT
PM.  Because  the  covariance  matrix  is  needed,  one  must  make  at  least  one  pass 
through  the  Data-Modeling  block  before  using  KLTRAN,  and  then  repeat  the  Data- 
Preparation  block  with  the  principal-component  data.  In  KLTRAN,  the 
covariance  matrix  is  used  in  a  Karhunen-Loeve  transformation  to  create  a  new 
composite  image  consisting  of  the  principal  components  of  the  original 
composite.  Since  a  subset  of  the  principal  components  is  usually  desired,  the 
user  will  typically  invoke  the  option  in  KLTRAN  that  separates  the  desired 
components  into  a  set  of  individual  DIAL  images.  At  this  point,  INTERL  must 
be  called  again  to  interleave  the  new  subset,  followed  by  a  call  to  SAVE. 
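
For reference, the standard form of the Karhunen-Loeve transformation can be
written as follows. The symbols E, e_i, and \lambda_i are introduced here for
illustration, and whether KLTRAN subtracts the mean vector u before
projecting is an assumption, not something stated in this note.

```latex
% Eigen-decomposition of the N x N covariance matrix \Sigma of the composite
\Sigma e_i = \lambda_i e_i , \qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_N \ge 0

% Principal components of a data vector x (E has the e_i as its columns)
y = E^{T} (x - u), \qquad E = [\, e_1 \; e_2 \; \cdots \; e_N \,]
```

Keeping only the first few components of y corresponds to selecting a subset
of the principal-component images.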

Data-Modeling.  The  data-modeling  block  establishes  the  initial 
conditions  and  support  data  needed  during  classification.  This  block  is 
implemented  via  FIELDEF  and  CLASTAT.  The  effect  of  the  FIELDEF  routine  is  to 
create  a  field/class  DIAL  file  storing  the  locations  of  a  set  of  fields  (image 
subareas  of  interest  to  the  user).  The  effect  of  CLASTAT  is  to  extend  the 
file  with  a  set  of  classes,  consisting  of  statistical  models  for  the  fields  of 
interest. 

The  usage  of  FIELDEF  and  CLASTAT  varies  depending  on  whether  one  will 
later  choose  to  classify  the  data  using  the  supervised  classification  routine 
(MAXLIK)  or  the  unsupervised  routine  (CLUSTER).  If  an  unsupervised  approach 
is  taken  and  if  the  starting  vectors  are  self-generated  (see  the  discussion  on 


5 M. Crombie, N. Friend, R. Rand, Feature Component Reduction Through
Divergence Analysis, U.S. Army Engineer Topographic Laboratories, Fort
Belvoir, VA, ETL-0305, October 1982, AD-A123 474.

6 R. Rand and J. Shine, Feature Analysis and Reduction of the Laws Texture
Measure, U.S. Army Engineer Topographic Laboratories, Fort Belvoir, VA,
ETL-0343, October 1983, AD-A138 366.




CLUSTER  routine  on  page  10),  no  training  sets  need  be  defined,  and  only 
FIELDEF  is  called  to  define  the  subimage  for  classification;  however,  if  the 
starting  vectors  are  not  self-generated  but  defined  according  to  the 
statistics  of  selected  training  areas,  both  FIELDEF  and  CLASTAT  are  used  in  a 
manner  similar  to  that  of  taking  the  supervised  approach.  As  an  option, 
FIELDEF  can  also  be  used  here  to  define  reference  areas  to  analyze  results. 
That  is,  if  a  region  (such  as  a  forest  area)  is  outlined,  the  final  results  in 
this  area  can  later  be  viewed  yielding  such  information  as  the  percentage  of 
points  correctly  assigned  to  the  cluster  (associated  with  forest). 

If  the  data  are  classified  using  the  supervised  approach,  both  FIELDEF  and 
CLASTAT  are  invoked.  In  this  case,  FIELDEF  is  called  to  define  the  areas 
(fields)  that  will  be  used  by  CLASTAT  in  computing  training  sets,  as  well  as 
to  define  the  classification  and  reference  areas.  CLASTAT  generates  the 
training  sets  needed  by  MAXLIK.  A  training  set  is  a  set  of  class  records, 
each  record  being  a  statistical  model  of  a  class  and  consisting  of  a  mean 
vector  and  a  covariance  matrix  of  an  area  (field)  previously  defined  in 
FIELDEF.  In  addition  to  their  function  in  the  MAXLIK  algorithm,  the  mean 
vectors  in  the  training  sets  are  also  what  are  used  to  generate  the  starting 
vectors  needed  in  a  class-generated  clustering  approach. 
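
A class record of this kind can be sketched as below. This is an
illustrative computation only: the function name is hypothetical, and the
use of the unbiased (count - 1) divisor is an assumption, since the note
does not say which estimator CLASTAT uses.

```python
def class_record(pixels):
    """Return (mean vector, covariance matrix) for one training field.

    `pixels` is a list of N-channel data vectors taken from the field.
    The (count - 1) divisor is an assumption for this sketch.
    """
    count = len(pixels)
    n = len(pixels[0])
    mean = [sum(p[c] for p in pixels) / count for c in range(n)]
    cov = [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in pixels)
            / (count - 1)
            for b in range(n)]
           for a in range(n)]
    return mean, cov
```

For example, `class_record([[1, 2], [3, 6]])` gives the mean vector
`[2.0, 4.0]` and the covariance matrix `[[2.0, 4.0], [4.0, 8.0]]`.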

To  aid  in  the  selection  of  training  models,  CLASTAT  can  measure  the 
distance  between  two  classes.  This  measure,  called  the  Bhattacharyya 
distance,  has  values  ranging  between  zero  and  one.  A  value  of  one  indicates 
there  is  no  separation  between  two  classes;  a  value  very  close  to  zero 
indicates  good  separation.  The  order  of  magnitude  is  what  is  important.  For 
example, a distance value of 10^-2 would show good separation, whereas a
value of 10^-1 would show poor separation. Another criterion for selecting good
classes  is  to  check  the  mean  and  standard  deviation  of  each  channel. 

Generally,  if  the  standard  deviations  are  high  compared  to  the  means,  the 
corresponding  area  on  the  image  is  not  homogeneous  and  the  class  will  not  make 
a  good  training  model. 
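
The two selection criteria can be illustrated for a single channel. A
measure that equals one for identical classes and approaches zero for
well-separated classes behaves like the Bhattacharyya coefficient exp(-B),
where B is the Bhattacharyya distance between two Gaussian models; treating
CLASTAT's measure as this coefficient is an assumption, as are the function
names and the `max_ratio` threshold below.

```python
import math


def bhattacharyya_coefficient(mean1, var1, mean2, var2):
    """exp(-B) for two one-channel Gaussian class models (variances > 0)."""
    b = (0.25 * (mean1 - mean2) ** 2 / (var1 + var2)
         + 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2))))
    return math.exp(-b)


def looks_homogeneous(mean, std, max_ratio=0.5):
    """Flag a channel whose standard deviation is small relative to its mean.

    The threshold max_ratio is hypothetical; the note gives no numeric rule.
    """
    return mean != 0 and std / abs(mean) <= max_ratio
```

Two identical models give a value of exactly one, and moving the means apart
drives the value toward zero, which matches the order-of-magnitude reading
described above.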

To  obtain  the  optimum  training  set,  a  user  may  have  to  switch  back  and 
forth  between  FIELDEF  and  CLASTAT,  taking  advantage  of  the  highly  interactive 
nature  of  each  program.  Typically,  the  optimum  set  will  be  derived  from  a  set 
of  homogeneous  fields  that  are  well  separated.  The  display  capability  in 
FIELDEF  is  particularly  useful  in  selecting  the  fields,  since  the  character¬ 
istics  of  homogeneity  and  separability  can  usually  be  spotted  visually. 

FIELDEF  will  superimpose  any  three  of  the  composites'  channels  on  color 
overlays.  When  selecting  a  field,  one  uses  this  capability  and  with  a  cursor 
draws  a  polygon  or  line  segment  of  up  to  12  vertices.  With  the  exception  of 
the  cluster  routine,  FIELDEF  and  CLASTAT  are  by  far  the  most  interactive  PM's 
for  the  DIAL  classification  package.  The  user  will  spend  most  of  his  time 
here. 

Classification.  In  the  classification  block,  the  user  implements  one 
of  two  classification  routines,  using  the  data  derived  from  blocks  one  and 
two.  MAXLIK  is  the  DIAL  PM  that  performs  the  supervised  classification; 
CLUSTER  performs  the  unsupervised  classification. 

The MAXLIK routine classifies a subimage in the composite image, assigning
each  point  (data  vector)  to  the  class  with  the  smallest  value  of  the 
likelihood  function.  Essentially,  the  tasks  involve: 

1.  Selecting  a  composite  image. 

2.  Selecting  fields  from  a  field/class  file. 

3.  Selecting  classes  from  a  field/class  file. 

4.  Assigning  a  color  and  a  character  to  each  class. 

5.  Selecting  the  complexity  of  the  discriminator. 

6.  Assigning  a  priori  weights  to  classes. 

7.  Displaying  results. 

By  selecting  a  set  of  fields,  the  user  has  defined  the  subimage  to  be 
classified  as  the  smallest  rectangular  area  in  the  image  enclosing  the  set. 
This  subimage  can  be  displayed,  if  desired.  The  set  of  classes  selected  by 
the  user  will  be  used  as  the  training  model  to  assign  labels  to  each  point  in 
the  subimage.  Up  to  10  classes  can  be  chosen.  A  color  is  assigned  to  each 
class  so  that  the  results  can  be  displayed  as  a  color  map  on  one  of  the 
monitors.  The  map  is  a  DIAL  image  file  containing  a  pseudocolor  function 
memory  that  can  be  saved  and  later  redisplayed.  An  option  to  assign 
characters  to  each  class  is  available  so  that  the  results  can  be  viewed  in 
greater  detail  (point  by  point)  on  either  the  work  station  screen  or  the  line 
printer. 

The  complexity  of  the  discriminator  varies  depending  on  the  form  that  the 
covariance  matrix  (of  the  class  models)  takes  in  the  likelihood  function.  The 
user  has  the  option  to  use  the  full  covariance  matrix,  the  diagonal  elements 
of  the  covariance  matrix,  the  expected  value  of  the  trace  of  the  covariance 
matrix,  or  the  identity  matrix.  Except  for  the  full  covariance,  the  other 
options  lead  to  successively  greater  approximation  in  the  discriminator  (the 
likelihood  function). 

L_k(x) = \log|\Sigma_k| + (x - u_k)^T \Sigma_k^{-1} (x - u_k) - 2 \log P_k    (1)

where

x = data vector for the point in question

u_k = mean vector for the kth class

\Sigma_k = covariance matrix for the kth class

P_k = a priori weight for the kth class

The greater the approximation, the fewer the computations required in the
second term. For example, for N channels the full covariance matrix contains
N(N+1)/2 distinct elements, whereas the diagonal of the covariance contains
only N elements; using the identity matrix removes the covariance from the
computation entirely, so that the first term vanishes and the second term
reduces to a squared Euclidean distance. As for assigning values to the
a priori weights, the default is to assign equal weights to all the classes.
Such a default eliminates the need for the third term, an additional
approximation.
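
A minimal sketch of the discriminant in equation (1) is given below for two
of the simpler options: the diagonal of the covariance matrix and the
identity matrix. The full-covariance case is omitted to keep the example
short. The function names and the layout of the class table are assumptions
for illustration and are not the MAXLIK interface.

```python
import math


def discriminant(x, mean, variances, prior=1.0, use_variances=True):
    """Equation (1) with a diagonal covariance; smaller values are better."""
    if use_variances:
        log_det = sum(math.log(v) for v in variances)
        quad = sum((xi - mi) ** 2 / v
                   for xi, mi, v in zip(x, mean, variances))
    else:
        # Identity matrix: the log-determinant is zero and the quadratic
        # form is a squared Euclidean distance.
        log_det = 0.0
        quad = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return log_det + quad - 2.0 * math.log(prior)


def classify(x, classes, use_variances=True):
    """`classes` maps a label to (mean, variances, prior); returns a label."""
    return min(classes,
               key=lambda k: discriminant(x, *classes[k], use_variances))
```

With equal a priori weights the last term is the same for every class and
does not affect which class attains the minimum.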

Closely related to MAXLIK is another DIAL PM called PLABEL. Basically,
PLABEL is a modified version of MAXLIK that was made so that the results of
MAXLIK  could  be  smoothed  by  a  probabilistic  relaxation  algorithm.  The 
implementation  of  PLABEL  is  almost  identical  with  that  of  MAXLIK,  except  that 
the  processing  time  of  PLABEL  is  somewhat  longer.  Therefore,  if  one  wishes  to 
process  the  results  of  MAXLIK  with  the  relaxation  algorithm,  one  should  use 
PLABEL  in  place  of  MAXLIK;  otherwise  one  should  use  MAXLIK. 

The CLUSTER routine is an iterative unsupervised classification process
based on Ball and Hall's ISODATA (Iterative Self-Organizing Data Analysis
Techniques A) algorithm. The algorithm makes multiple passes through the
data,  assigning  data  to  clusters  and  splitting  or  combining  clusters  in  a 
user-defined  sequence.  Essentially,  the  tasks  involve: 

1.  Selecting  a  composite  image. 

2.  Selecting  fields  from  a  field/class  file. 

3.  Selecting  cluster  parameters. 

4.  Selecting  starting  vectors. 

5.  Performing  a  cluster  sequence. 

6.  Displaying  results. 

7.  Stopping  or  going  back  to  task  3  and  reiterating. 

In  completing  task  1  and  task  2,  the  user  has  defined  the  subimage  that  will 
be  processed  into  clusters.  As  is  the  case  with  MAXLIK,  this  subimage  is  the 
smallest  rectangular  area  on  the  image  enclosing  the  set  of  fields.  The 
subimage can be displayed on a monitor, if desired. There are 11 initial
clustering  parameters,  along  with  a  split/combine  sequence,  that  must  be 
specified.  Such  information  as  thresholds  for  splitting  and  combining 
clusters,  scale  factors,  the  distance  measure  used  in  combining  clusters,  the 
distance  measure  used  in  assigning  data  points  to  clusters,  the  minimum  number 
of  data  points  in  a  cluster,  and  the  maximum  number  of  clusters  allowed,  is 
supplied  in  this  task.  Since  a  few  of  these  parameters  are  sensitive  to  the 
characteristics  of  the  data,  a  certain  amount  of  experimentation  might  be 
necessary  to  obtain  optimal  results. 

Starting  vectors  are  used  to  specify  cluster  centers  for  the  initial 
assignment  of  data  to  clusters.  There  are  three  methods  available: 

1. Self-generated, in which the routine selects a single starting
vector consisting of zeros in all channels.

2. Class-generated, in which the user selects up to 50 previously
defined  classes  (from  a  field/class  file)  as  starting  vectors. 

3.  Previously  generated,  in  which  the  starting  vectors  are  defined 
as  the  clusters  determined  during  the  immediately  preceding 
cluster  run  of  the  present  DIAL  session. 

Of  course  this  third  method  cannot  be  exercised  until  the  cluster  sequence 
has  been  performed  at  least  once.  The  use  of  previously  generated  starting 
vectors  is  particularly  useful  when  one  wishes  to  extend  the  clustering 
sequence  after  viewing  the  displayed  results  in  task  6. 

After  the  cluster  parameters,  the  starting  vectors,  and  the  split/combine 
sequence  are  defined,  it  is  a  simple  matter  of  one  command  to  implement  the 
clustering  process.  Upon  completion  of  the  process,  the  user  can  immediately 
display the resulting class map, since a color code has automatically assigned
a  color  to  each  cluster.  If  desired,  the  user  can  assign  a  character  to  each 
cluster  and  make  a  point-by-point  analysis  of  the  results.  At  this  point,  if 
the  results  are  satisfactory,  the  user  may  stop.  Otherwise,  the  process  can 
continue  by  returning  to  task  3.  The  clustering  can  then  proceed  from  scratch 
or  from  the  previously  generated  clusters. 
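
The core of one clustering pass (assigning data to clusters and updating the
cluster centers) can be sketched as follows. This is a simplified
illustration, not the CLUSTER routine: the split and combine steps are
omitted, the squared Euclidean distance is an assumed choice of distance
measure, and the names `cluster_pass` and `min_points` are hypothetical.

```python
def cluster_pass(points, centers, min_points=1):
    """Assign each point to its nearest center, then recompute the centers.

    Clusters with fewer than `min_points` members are dropped.
    """
    members = [[] for _ in centers]
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        members[dists.index(min(dists))].append(p)
    new_centers = []
    for group in members:
        if group and len(group) >= min_points:
            n = len(group)
            new_centers.append([sum(col) / n for col in zip(*group)])
    return new_centers
```

Feeding the returned centers back in as starting vectors mirrors the
"previously generated" option described above.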

Postprocessing.  After  a  scene  has  been  classified,  there  are  two 
methods  available  on  DIAL  for  image  smoothing:  probabilistic  relaxation  and 
raster  processing.  The  relaxation  method  is  based  on  an  approach  developed  by 
A. Rosenfeld at the University of Maryland.9 This method considers the
influence  of  a  neighborhood  on  a  point,  and  exploits  concepts  of  information 
theory  to  modify  a  point's  existing  class  label.  Each  point  must  have  been 
assigned  a  set  of  class  labels  along  with  a  set  of  probabilities  associated 
with  the  labels,  rather  than  one  definite  class  label.  As  an  option,  a 
certain  amount  of  external  knowledge  about  the  compatibility  of  the  classes 
can  be  embedded  into  the  process  using  a  compatibility  matrix.  The  relaxation 
function  then  updates  a  class  label  based  on  the  information  about  the  point 
and  its  surrounding  area,  attempting  to  minimize  the  entropy  of  the 
probability  set  associated  with  the  point.  The  formula  for  updating  the 
probability p_{ij}^{n+1}(\lambda_k) for the (i,j) point and the kth class is

p_{ij}^{n+1}(\lambda_k) = \frac{p_{ij}^{n}(\lambda_k) [1 + q_{ij}^{n}(\lambda_k)]}{\sum_{k'=1}^{K} p_{ij}^{n}(\lambda_{k'}) [1 + q_{ij}^{n}(\lambda_{k'})]}    (2)

where

q_{ij}^{n}(\lambda_k) = \sum_{l,m} C_{ijlm} \sum_{k'} r_{ijlm}(\lambda_k, \lambda_{k'}) p_{lm}^{n}(\lambda_{k'})

and

n = superscript that indicates the iteration number

r_{ijlm}(\lambda_k, \lambda_{k'}) = the compatibility of label \lambda_k for
the point (i,j) with the label \lambda_{k'} for point (l,m); takes on values
in the interval [-1, 1]

C_{ijlm} = the weighting of the points (l,m) that are neighbors of point
(i,j); 0 \le C_{ijlm} \le 1 and \sum_{l,m} C_{ijlm} = 1 for each point (i,j)

\sum_{k=1}^{K} p_{ij}^{n}(\lambda_k) = 1 for all pairs (i,j) and all
iterations n
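
The update for a single point can be sketched as follows: each label
probability is multiplied by one plus its neighborhood support, and the
results are renormalized to sum to one. The function names and the list
layout are assumptions for illustration and do not describe the RELAX
implementation; the compatibility matrix is taken to be the same for every
neighbor to keep the example short.

```python
def support(neighbor_probs, weights, compat):
    """Neighborhood support q for each label at one point.

    neighbor_probs: one probability list (over labels) per neighbor
    weights:        one weight per neighbor, summing to 1
    compat:         compat[k][k2] in [-1, 1], shared by all neighbors
    """
    labels = range(len(compat))
    return [sum(c * sum(compat[k][k2] * p[k2] for k2 in labels)
                for c, p in zip(weights, neighbor_probs))
            for k in labels]


def relax_point(p_point, q_point):
    """Multiply each probability by (1 + q) and renormalize.

    Assumes at least one product is positive, so the total is nonzero.
    """
    raw = [p * (1.0 + q) for p, q in zip(p_point, q_point)]
    total = sum(raw)
    return [v / total for v in raw]
```

For a point that is undecided between two labels and whose neighbors all
carry the first label, one update with a compatibility matrix of
`[[1, -1], [-1, 1]]` moves the point entirely to the first label.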

The  DIAL  routines  for  implementing  the  relaxation  process  are  RELAX  and 
ITRES.  RELAX  will  update  the  class  labels  and  the  associated  probability 

9 A. Rosenfeld, R. Hummel, S. Zucker, "Scene Labeling by Relaxation
Operations," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-6,
June 1976.
estimates. The user interactively selects the values for the compatibility
matrix  r  and  weighting  (distance)  matrix  C.  RELAX  takes  the  class  maps 
generated  from  PLABEL  as  input.  Recall  that  PLABEL  is  a  modified  version  of 
MAXLIK  designed  specifically  to  generate  data  for  the  RELAX  routine.  ITRES  is 
a  displaying  routine  that  presents  the  results  yielded  by  RELAX. 

The  second  method  for  image  smoothing  is  the  raster-processing 
approach.  The  name  is  derived  from  the  fact  that  the  RASTER  PM  processes 
data in a raster format, as opposed to a vector format. The method is
subjective since the user is the one who decides what areas are to be retained and
what  areas  are  to  be  erased,  and  what  areas  are  to  be  expanded  and  what  areas 
are  to  be  shrunk.  The  RASTER  PM  was  written  by  Goodyear  Aerospace  Corporation 
for  the  STARAN  associative  array  processor,  which  is  ideally  suited  to 
processing  raster  data  since  the  data  are  generally  stored  in  an  array  format. 

Since  RASTER  requires  binary  imagery,  the  standard  procedure  for  using 
RASTER  in  a  classification  sequence  is  to  establish  binary  planes  of  data,  one 
for  each  class  that  is  identified.  These  class  planes  contain  values  of  one 
or  zero;  one,  if  a  point  is  a  member  of  the  class  and  zero,  if  it  is  not.  The 
user  then  interactively  cleanses  these  images  via  calls  to  the  nine  available 
raster  functions. 
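
Splitting a class map into binary planes can be sketched as follows. The
function name and the dictionary layout are assumptions for illustration;
the nine raster functions themselves are not reproduced here.

```python
def class_planes(class_map, labels):
    """Return one binary plane per class: 1 for members, 0 otherwise."""
    return {k: [[1 if v == k else 0 for v in row] for row in class_map]
            for k in labels}
```

For example, `class_planes([[1, 2], [2, 2]], [1, 2])` yields the plane
`[[1, 0], [0, 0]]` for class 1 and `[[0, 1], [1, 1]]` for class 2.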

Program TEXLAW. The purpose of TEXLAW is to generate image planes of
texture data from single-channel gray shade images. Two texture measures
previously studied by ETL, the two-component Ad-Hoc texture measure and the
Laws texture measure,10 can be computed using this routine. In addition, many
other texture measures can be constructed using the program's two-step
procedure. The resulting texture data are stored as a set of DIAL image
planes, each plane corresponding to one texture component, and the set is
compatible  with  the  DIAL  program  modules  for  classifying  multichannel  imagery. 

A  two-step  procedure  is  used  to  compute  the  image  descriptors.  In  the 
first  step,  an  intermediate  image  is  generated  by  convolving  the  original 
image  with  a  symmetric  mask  defined  by  the  user.  In  the  second  step,  the 
intermediate  image  is  operated  on  with  one  of  three  pairs  of  window  functions 
selected  by  the  user.  The  purpose  of  this  second  step  is  to  produce 
components  that  are  either  low  frequency  functions  (creating  a  blurring 
effect)  or  high  frequency  functions  (energy  measures  that  enhance  edges  of 
structural  information)  of  the  intermediate  image.  These  various  texture 
measures  can  be  divided  into  two  groups: 

TEXTURE  MEASURES:  GROUP  1 


The two-component Ad-Hoc texture measure (average and standard
deviation).

The two-component correlation texture measure (magnitude and
direction).

The two-component gradient measure (magnitude and direction).

TEXTURE  MEASURES:  GROUP  2 

The  Laws  texture  measure  (individual  components). 

Variations of Laws texture (individual components; the energy
measure is replaced by either the correlation or gradient
measure).
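
The two-step procedure can be illustrated with a short Python sketch. This is not the TEXLAW code: the function names (window_view, convolve_mask, ave, std, ad_hoc_texture), the use of NumPy, the square odd-sized mask, and the edge-replication fill are all assumptions made for illustration. Step 1 applies a user-defined mask; step 2 applies the average/standard deviation window pair of the Ad-Hoc measure.

```python
import numpy as np

def window_view(image, size):
    """Return every size x size neighborhood of the image. Edge pixels are
    replicated so the result keeps the input dimensions (assumed fill rule)."""
    pad = size // 2
    padded = np.pad(np.asarray(image, dtype=float), pad, mode="edge")
    return np.lib.stride_tricks.sliding_window_view(padded, (size, size))

def convolve_mask(image, mask):
    """Step 1: weight each neighborhood by the mask and sum the products.
    For a symmetric mask this is the same as a convolution."""
    mask = np.asarray(mask, dtype=float)
    return np.einsum("ijkl,kl->ij", window_view(image, mask.shape[0]), mask)

def ave(image, size):
    """Step 2, low frequency function: local average (blurring effect)."""
    return window_view(image, size).mean(axis=(2, 3))

def std(image, size):
    """Step 2, high frequency function: local standard deviation (energy)."""
    return window_view(image, size).std(axis=(2, 3))

def ad_hoc_texture(image, mask, window):
    """Return the two texture planes (average, standard deviation)."""
    intermediate = convolve_mask(image, mask)
    return ave(intermediate, window), std(intermediate, window)
```

Because both outputs keep the size of the input, either plane could be passed back in as the image of a further run.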

There are two types of sampling used in TEXLAW. The first type selects a
regularly spaced grid of points on both the source and the intermediate
image and is defined by the sampling factors "SKIP1" and "SKIP2." The effect
of this sampling is to reduce the size of the intermediate image and the
size of the DIAL images. (Example: NPIX = (NP-1)/SKIP1 + 1, where NP is the
number of pixels on the source and NPIX is the resulting number of pixels on
a record of the intermediate image; and NPIX2 = (NPIX-1)/SKIP2 + 1, where
NPIX2 is the resulting number of pixels on a record of the DIAL image.)
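
The record lengths in the example above can be checked with a few lines of Python. The helper name sampled_length is hypothetical, and truncating integer division is assumed for the "/" in the formulas.

```python
def sampled_length(n_pixels, skip):
    """Number of pixels kept when every skip-th pixel of a record is taken,
    starting with the first: (n_pixels - 1) / skip + 1 with integer division."""
    return (n_pixels - 1) // skip + 1

# A 1024-pixel source record sampled with SKIP1 = 2, then SKIP2 = 1.
npix = sampled_length(1024, 2)    # record length of the intermediate image
npix2 = sampled_length(npix, 1)   # record length of the DIAL image
print(npix, npix2)
```

With these inputs the intermediate and DIAL records both hold 512 pixels.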

The other type of sampling is an option used to reduce the number of
computations during the window operation in the second step. If WSZ**2 is the
number of points used in the computation and NEXP is the expansion (sampling)
factor used to select these points, the effective size of the window covers an
area of NWSZ**2, where NWSZ = (WSZ-1)*NEXP + 1. Therefore, large window
operations can be simulated at the same cost as performing small window
operations. This option was added to enable a user to experiment with the
possibility of using large window areas in the texture operation without using
the corresponding large number of points.
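
A small Python sketch of the expanded window follows. The helper names are hypothetical, and centering the sample points on the pixel being processed (with an odd WSZ) is an assumption; the text gives only the formula for NWSZ.

```python
def effective_window_size(wsz, nexp):
    """Side length of the area covered: NWSZ = (WSZ - 1) * NEXP + 1."""
    return (wsz - 1) * nexp + 1

def window_offsets(wsz, nexp):
    """Offsets, along one axis, of the WSZ sample points relative to the
    window center (assumes an odd WSZ so the center is itself a sample)."""
    half = (wsz - 1) * nexp // 2
    return [k * nexp - half for k in range(wsz)]

# A 5 x 5 set of points spread with NEXP = 3 covers a 13 x 13 area
# while still using only 25 points per window.
print(effective_window_size(5, 3), window_offsets(5, 3))
```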

TEXLAW  can  be  used  iteratively.  The  DIAL  images  that  are  generated  are 
filled  at  the  edges  so  that  windowing  does  not  decrease  their  size.  Thus, 
output images are registered to the input images when SKIP1 = SKIP2 = 1.

These  outputs  can  be  used  as  input  to  another  iteration.  For  example,  two 
images  can  be  created  using  AVE  and  STD  as  a  first  iteration.  Another  run  can 
follow  this  with  the  input  image  from  STD,  using  AVE  and  STD  to  generate  two 
additional  outputs,  the  first  a  blur  of  the  STD  image  and  the  second  an  energy 
measure  of  the  STD  image.  Thus,  higher  order  texture  measures  can  be  computed 
from  lower  order  measures  using  an  iterative  strategy. 

AN  EXPERIMENT  USING  DIAL'S  CLASSIFICATION  PROCEDURE 

A  classification  experiment  was  performed  on  DIAL  following  the  procedure 
described  in  the  previous  section.  The  purpose  of  the  experiment  was  to  test 
some  of  the  more  promising  image  descriptors  considered  in  earlier  studies 
under  a  more  flexible  and  interactive  classification  system.  Because  of  a 
rather  rigid  classification  procedure  in  the  previous  studies,  the  possibility 
existed  that  the  poor  performance  of  the  image  descriptors  was  due  not  only  to 
their own weaknesses in supplying discriminatory information but also to
weaknesses  in  the  classification  procedure.  If  the  classification  scenario 
was  optimized,  perhaps  the  performance  of  the  descriptors  could  be  improved. 

In  addition,  there  was  the  possibility  that  a  combination  of  the  more 
promising  image  descriptors  would  improve  performance. 


Description  of  Experiment.  The  source  data  in  the  experiment  was  a 
panchromatic  image  containing  1024  by  1024  pixels  and  having  a  ground 
resolution  of  1  meter.  The  scene,  called  scene  A,  is  a  rural  area  containing 
forests,  fields,  a  road,  and  a  few  buildings  and  has  been  used  in  all  of  the 
previous  studies  on  texture  and  image  segmentation.  This  scene  is  shown  in 
figure 3.

The experiment can be thought of as consisting of two parts. In part A,
39 images were generated from program TEXLAW, and from these, the most
promising  were  used  in  both  a  supervised  (MAXLIK)  and  unsupervised  (CLUSTER) 
classification  run.  The  emphasis  here  was  on  choosing  a  small  set  of  image 
components.  Selecting  the  fields  and  classes  was  a  highly  interactive 
procedure  requiring  a  number  of  field/class  definitions  and  mergings  before 
obtaining  what  was  judged  to  be  an  optimal  training  model.  In  part  B, 
principal  component  images  were  generated  from  some  of  the  images  created  in 
part  A,  and  the  effect  of  both  decreasing  and  increasing  the  number  of 
components  was  tested.  Only  supervised  classification  runs  were  invoked. 

Part  A.  Data  preparation  (Block  1)  was  initiated  by  generating  16 
pairs  of  images  from  scene  A,  where  each  pair  —  a  convolved  image  and  an 
energy  image  —  corresponded  to  a  particular  LAWS  window  (see  section  on 
program  TEXLAW  and  the  appendix).  In  addition,  15  ratioed  images  were 
generated by ratioing the last 15 energy images to the first energy image
using  program  RATIOF,  and  the  ratioed  images  corresponded  to  the  15  texture 
components  defined  by  Laws.  Figures  4  and  5  show  the  pair  of  images  generated 
from  the  LAWS  window  LL.  In  the  top  image,  every  other  point  on  scene  A  was 
convolved  with  LL  (a  5X5  window  was  used).  The  bottom  image  is  the  result  of 
computing  the  standard  deviation  about  every  point  in  the  convolved  image 
(13X13  window).  Figures  6  and  7  show  the  pair  of  images  generated  from  the 
LAWS  window  EE.  The  result  of  using  program  RATIOF,  with  the  EE  energy  image 
as  the  numerator  and  the  LL  energy  image  as  the  denominator,  is  shown  in 
figure  8.  For  a  listing  of  the  16  LAWS  convolution  masks  along  with  an 
explanation  of  their  derivations,  see  the  appendix. 

The  data  preparation  was  completed  by  selecting  those  images  that  appeared 
to  contain  the  most  discriminatory  information  and  building  them  into  a  single 
composite  image.  Selecting  the  images  was  done  visually  by  displaying  each  of 
them  on  one  of  the  work  station’s  monitors  and  subjectively  choosing  those 
that  collectively  gave  the  most  discriminating  ability  to  a  human  observer. 
Since  at  this  point  a  human  being  can  outperform  the  computer  in  identifying 
cartographic  features,  it  was  assumed  that  if  a  person  fails,  then  the 
computer  will  certainly  fail.  The  most  significant  images  were  those 
associated  with  the  LL  and  EE  images  (particularly  LL).  Of  these,  three  were 
selected:  the  convolved  image  generated  by  LL,  called  A54CONLL;  the 
corresponding  energy  image,  called  A54STDLL;  and  the  energy  image  generated  by 
EE,  called  A54STDEE.  The  A  designates  scene  A  and  54  designates  the  exposure 
number  of  the  image;  CON  represents  convolution  and  STD  represents  standard 
deviation.  The  three  images  were  built  into  a  composite  using  program  INTERL. 


Data modeling (Block 2) was initiated by calling the program FIELDEF
and superimposing A54CONLL, A54STDLL, and A54STDEE on three color planes.
This superimposed set is displayed as a false-color image, in which the
intensity of each plane can be adjusted to emphasize important features.
Regions  in  the  scene  that  were  thought  to  be  of  use  in  the  training  model  were 
outlined  with  a  cursor  using  a  trackball.  A  number  of  polygons  were  drawn 
around  forest,  field,  and  building  areas.  Some  line  segments  were  drawn  along 
the  road  in  the  scene.  Table  1  shows  a  complete  list  of  the  fields  that  were 
later  used  to  construct  the  class  models.  Note  the  numbers  in  the  middle 
column  that  depict  the  vertices  of  the  polygons  or  line  segments.  Note  also 
that,  in  addition  to  the  small  fields,  a  large  field  "BASEIMAGE"  was 
defined.  This  large  field  defined  the  area  on  the  scene  that  was  to  be 
classified; essentially the entire image. Figure 9 shows these fields
overlaid on the false-color image.


Figure 9. False-color display of image components.


Having collected a set of fields to work from, the operator called on
CLASTAT to construct the training classes. The process was iterative: classes
were created and then tested for separability using the Bhattacharyya
distance measure. Table 2 lists the resulting classes. Initially,
eight classes were created.

1. The class FOREST 1 from the field HEAVYFOREST.

2. The class FOREST 2 from the fields LIGHTFOREST 1, LIGHTFOREST 2,
LIGHTFOREST 1B.

3. The class FOREST 3 from the field LIGHTFOREST 1A.

4. The class FIELD 1 from the field FIELD 1.

5. The class FIELD 2 from the fields ROUGHFIELD, ROUGHFIELD 1.

6. The class ROAD from the field ROADFIELD.

7. The class BUILDING from the field BUILDING 1.

8. The class SHADOW from the field SHADOW 1.


Table  1.  Listing  of  Field  Results 
Table 2. Listing of Class Results for Three Components

One can see from the Bhattacharyya distance measures listed in table 3 that
some  of  these  training  classes  were  almost  inseparable.  Classes  2,  3,  and  5 
had  a  distance  of  10"'  or  greater;  likewise  for  classes  6  and  7.  Also  class 
8,  corresponding  to  the  shadow  areas  surrounding  buildings,  had  a  distance  of 
10"'  between  class  1  and  class  2.  Thus,  the  following  merger  and 
redefinition  took  place: 

Class 10 (HFOREST) was renamed from class 1.

Class  11  (FIELD)  was  renamed  from  class  4. 

Class  9  (FSTFIELD)  was  constructed  by  merging  classes  3  and  5. 

Class  12  (BLGROAD)  was  constructed  by  merging  classes  6  and  7. 

The shadow class was dropped from consideration, since a merger of this class
with class 1 and/or class 2 would probably lead to greater confusion between
these two classes; the separation between class 1 and class 2 was already
poor.
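
The separability test behind these mergers can be sketched in Python. The form below is the standard Bhattacharyya distance between two Gaussian class models; whether CLASTAT uses exactly this form or scaling is not stated in the text, so the function is an illustrative assumption and the name bhattacharyya_distance is hypothetical.

```python
import numpy as np

def bhattacharyya_distance(mean1, cov1, mean2, cov2):
    """Bhattacharyya distance between two Gaussian classes, each described
    by a mean vector and a covariance matrix."""
    mean1, mean2 = np.asarray(mean1, float), np.asarray(mean2, float)
    cov1, cov2 = np.asarray(cov1, float), np.asarray(cov2, float)
    pooled = 0.5 * (cov1 + cov2)
    diff = mean2 - mean1
    # Term driven by the separation of the class means.
    mean_term = 0.125 * diff @ np.linalg.solve(pooled, diff)
    # Term driven by the difference between the class covariances.
    _, logdet_pooled = np.linalg.slogdet(pooled)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    cov_term = 0.5 * (logdet_pooled - 0.5 * (logdet1 + logdet2))
    return float(mean_term + cov_term)
```

A value near zero means the two class models overlap almost completely, which is the situation that motivates merging or dropping classes.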

Using the set of fields and classes gathered thus far, the classification
of scene A (step 3) was performed twice, via MAXLIK and CLUSTER. Implementing
the MAXLIK PM was straightforward. The field BASEIMAGE was selected to
define the area of classification, and classes 9, 10, 11, and 12 (depicted
by stars to the left of the class names in table 2) were used as the training
model. Colors to be associated with classes 9, 10, 11, and 12 were chosen as
yellow, blue, green, and red, respectively. The standard Bayes classifier (no
approximations) was invoked. Figure 10 shows the results of the
classification as a class map, and table 4 lists point-by-point results for
the upper left corner of the scene.
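
For illustration, a Gaussian maximum likelihood classifier of the kind described can be sketched as follows. This is not the MAXLIK code: equal prior probabilities, the function names train and classify, and the class names and values in the usage lines are assumptions.

```python
import numpy as np

def train(samples_by_class):
    """Estimate a mean vector and covariance matrix for each training class.
    samples_by_class maps a class name to an (n_points, n_channels) array."""
    models = {}
    for name, samples in samples_by_class.items():
        samples = np.asarray(samples, dtype=float)
        models[name] = (samples.mean(axis=0), np.cov(samples, rowvar=False))
    return models

def classify(pixels, models):
    """Assign each pixel vector to the class with the largest Gaussian
    log-likelihood (full covariance matrix, equal priors assumed)."""
    pixels = np.asarray(pixels, dtype=float)
    names = list(models)
    scores = []
    for name in names:
        mean, cov = models[name]
        diff = pixels - mean
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
        scores.append(-0.5 * (np.linalg.slogdet(cov)[1] + maha))
    return [names[i] for i in np.argmax(scores, axis=0)]

# Usage with synthetic three-channel training data (values are made up).
rng = np.random.default_rng(0)
models = train({
    "FIELD": rng.normal(50.0, 5.0, size=(200, 3)),
    "BLGROAD": rng.normal(150.0, 5.0, size=(200, 3)),
})
labels = classify([[52, 49, 51], [148, 151, 150]], models)
```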

The  implementation  of  CLUSTER  was  less  straightforward,  requiring  much 
experimentation  to  determine  the  best  clustering  strategy  and  the  most 
effective  parameter  values.  Until  a  strategy  could  be  found,  small  areas  of 
the  scene  were  processed  to  save  time  and  cost.  After  a  few  trials,  the 
"class-generated"  approach  to  selecting  starting  vectors  was  found  to  be  the 
most  reliable  and  accurate.  Considering  that  the  classes  had  already  been 
defined,  this  approach  was  also  quicker  and  easier.  After  adjusting  the 
values  of  some  of  the  cluster  parameters  and  defining  a  simple  split/combine 
sequence,  the  complete  scene  (defined  by  the  field  BASEIMAGE)  was 
classified.  Table  5  shows  the  values  of  the  starting  vectors  (the  same  values 
as  the  mean  vectors  for  classes  9  through  12,  listed  in  table  2),  the 
clustering  parameters,  the  interim  class  statistics,  and  the  final  cluster 
population. Although the clustering parameters were discussed only briefly in
the previous section, a complete description can be found in Rice, Shipman,
and Spieler.11 However, note that the parameter NUMMAX has limited the
maximum number of clusters to four and that a large value of R2 has forced
most of the points away from the null cluster and into one of the four
clusters (for small values of R2 most points would be assigned to the null
cluster). Also, the split/combine sequence was defined very simply as one
split. The final results of the clustering are shown as a class map in figure
11.
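
The effect of R2 described above can be illustrated with a short Python sketch. It assumes that R2 acts as a threshold on the squared Euclidean distance to the nearest cluster center and that cluster 0 denotes the null cluster; the actual CLUSTER distance function (selected by other parameters) is not given here, so this is only a plausible reading.

```python
def assign_cluster(point, centers, r2):
    """Return the 1-based index of the nearest cluster center, or 0 (the
    null cluster) when the squared distance to it exceeds r2."""
    best_index, best_d2 = 0, None
    for index, center in enumerate(centers, start=1):
        d2 = sum((p - c) ** 2 for p, c in zip(point, center))
        if best_d2 is None or d2 < best_d2:
            best_index, best_d2 = index, d2
    return best_index if best_d2 <= r2 else 0

# Hypothetical three-channel centers, chosen only to show the behavior.
centers = [(0.0, 0.0, 0.0), (100.0, 100.0, 100.0)]
near = assign_cluster((1.0, 1.0, 1.0), centers, r2=200.0)       # inside radius
far = assign_cluster((60.0, 60.0, 60.0), centers, r2=200.0)      # null cluster
forced = assign_cluster((60.0, 60.0, 60.0), centers, r2=1.0e6)   # large R2
```

Under this reading, raising R2 moves points such as the second one out of the null cluster and into the nearest real cluster.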


11W. Rice, J. Shipman, R. Spieler, Interactive Digital Image Processing
Investigation, prepared for U.S. Army Engineer Topographic Laboratories, Fort
Belvoir, VA, ETL-0172, December 1978, AD-A076 342.


Table 4. Subarea of MAXLIK's Class Map Results

Table 5. Listing of Cluster Results


STARTING VECTORS (CLUSTER CENTERS)

CHANNEL    CLUSTER 1    CLUSTER 2    CLUSTER 3    CLUSTER 4
   1        129.93        56.55       156.41       339.59
   2        199.13       113.14        33.99       184.74
   3        115.32       174.37        89.99       137.46


THE CURRENT VALUES OF THE INITIAL CLUSTERING PARAMETERS ARE

 1.  T1      =  4.5
 2.  T2      =  3.2
 3.  NMIN    =  39
 4.  NUMMAX  =  4
 5.  SEP     =  1.9
 6.  ISODAT  =  1
 7.  IDISF   =  2
 8.  P       =  9.9
 9.  R2      =  299.
10.  PMAX    =  19
11.  PN      =  1
12.  SPLIT/COMBINE SEQUENCE


INTERIM CLUSTER STATISTICS

             CLUSTER 1          CLUSTER 2          CLUSTER 3          CLUSTER 4
CHANNEL    MEAN   ST.DEV      MEAN   ST.DEV      MEAN   ST.DEV      MEAN   ST.DEV
   1      129.73   32.68      77.86   32.78     166.19   17.38     169.52   35.9
   2      124.39   36.59     138.77   38.79      95.39   19.97     183.34   14.3'
   3      125.96   33.59     167.78   32.53      86.41   17.51     136.99   48.1


INTERIM CLUSTER STATISTICS

             CLUSTER 1          CLUSTER 2          CLUSTER 3          CLUSTER 4
CHANNEL    MEAN   ST.DEV      MEAN   ST.DEV      MEAN   ST.DEV      MEAN   ST.DEV
   1      122.69   26.62      89.29   32.93     163.96   39.78     155.92   35.51
   2      129.14   19.38     141.14   37.43      94.72   15.15     182.36   13.97
   3      127.91   21.12     169.42   39.89      89.51   19.48     135.71   44.54

MINPOP (PN+NCHAN) = 4

NO. OF CLUSTERS = 4

CLUSTER POPULATIONS

CLUSTER    POPULATION

Table  6.  Statistical  Parameters  of  Principal  Components 


ENHANCEMENT OPTION 3    IMAGE A54PRINC3

BAND     MEAN     VARIANCE     EIGENVALUES     FRACTION OF TOTAL VAR.
  1     125.58     1788.9        3924.9
  2     135.53     3821.6        1199.8          .491.il
  3     139.61     2173.6         568.13         .279R6


ENHANCEMENT OPTION 3    IMAGE A54PRINC8

BAND     MEAN     VARIANCE     EIGENVALUES     FRACTION OF TOTAL VAR.
  1     125.19     1317.9        6578.5          .19373
  2     139.56     1129.1        2446.8          .88157E-01
  3     137.97     2926.9         946.26         .15945
  4     128.78     1697.4         757.93         .12651
  5     132.68     1544.1         699.22         .1215-3
  6     129.71     1181.5         544.81         .92918E-01
  7     122.79     1956.5         282.57         .83155E-01
  8     142.42     2852.1         143.58         .224*7


Part B of the experiment used principal component images derived from a
selected number of the images already generated by program TEXLAW in part A.
Two trials were made; one decreased the number of components, whereas the
other increased the number. In the first trial, the three components used in
part A were transformed into principal components, and then a composite image
consisting of the two most significant ones was classified by the supervised
algorithm MAXLIK. The image's covariance matrix, needed by the KLTRAN PM to
perform the transformation, was computed from the statistics of the points
within the field BASEIMAGE (enclosing the entire image) using CLASTAT. A call
to KLTRAN then produced three principal components, and using one of the PM's
options, two DIAL image planes were extracted. Table 6 gives a listing of the
statistical parameters associated with the resulting components (see listing
under the heading of A54PRINC3). Components one and two were built into a
composite image; the DIAL planes are shown in figures 12 and 13.
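
The transformation can be sketched in Python as an eigen-decomposition of the image covariance matrix. This is a generic principal components (Karhunen-Loeve) sketch, not the KLTRAN code; the function name and the use of NumPy are assumptions, and any rescaling KLTRAN applies before writing DIAL planes is not modeled.

```python
import numpy as np

def principal_components(pixels, keep):
    """Transform (n_points, n_channels) pixel vectors into principal
    components and keep the `keep` most significant ones. Also returns the
    eigenvalues (largest first) and each one's share of their sum."""
    x = np.asarray(pixels, dtype=float)
    mean = x.mean(axis=0)
    cov = np.cov(x, rowvar=False)            # image covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)     # returned in ascending order
    order = np.argsort(eigval)[::-1]         # reorder to largest first
    eigval, eigvec = eigval[order], eigvec[:, order]
    components = (x - mean) @ eigvec[:, :keep]
    return components, eigval, eigval / eigval.sum()
```

For the first trial, a three-channel composite would be passed in with keep=2.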

After  the  construction  of  a  composite  image,  the  procedure  of  this  first 
trial  followed  the  steps  described  in  part  A.  The  classes  were  created  using 
the  same  fields  as  those  that  had  created  classes  9  through  12,  but  with  the 
new  data  in  the  two-component  composite  image.  Table  7  lists  the  statistics 
of the resulting classes. Finally, the composite image was processed by
MAXLIK using the standard classifier. The resulting class map is shown in
figure 18.




In the second trial, eight of the images generated by TEXLAW in part A
were transformed into principal components and the results built into an
eight-component composite image. The source images were the three used above,
plus an additional five corresponding to the energy components of the LE, LS,
LR, EL, and ES Laws windows. The covariance matrix of the source composite
image was computed via CLASTAT. Then, using KLTRAN, followed by INTERL, an
eight-component composite image consisting of principal components was
created. The images of the first, second, fifth, and eighth principal
components are shown in figures 14 through 17, respectively; these were the
most interesting of the set. The classes were created the same way as in the
first trial and their statistics are listed in table 7. The scene was
classified twice. In the first run, the standard Bayes classifier was invoked,
whereas in the second, an approximation to the classifier that considered only
the diagonal components of the covariances was invoked. The class map results
from the approximated Bayes classifier are shown in figure 19.
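
The difference between the two runs can be sketched in Python. The function below evaluates a Gaussian log-likelihood with either the full class covariance or only its diagonal; the name gaussian_log_likelihood and the diagonal_only switch are hypothetical, and the exact form of the MAXLIK approximation is assumed rather than taken from the program.

```python
import numpy as np

def gaussian_log_likelihood(x, mean, cov, diagonal_only=False):
    """Log-likelihood of pixel vector x under a Gaussian class model.
    With diagonal_only=True the off-diagonal covariances are ignored, so
    the channels are treated as uncorrelated within the class."""
    x, mean, cov = (np.asarray(a, dtype=float) for a in (x, mean, cov))
    if diagonal_only:
        cov = np.diag(np.diag(cov))
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return float(-0.5 * (logdet + maha + len(mean) * np.log(2.0 * np.pi)))
```

The two forms agree when a class covariance is already diagonal and differ otherwise; the diagonal form needs only per-channel variances, which is cheaper for eight components.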

Discussion of Results. Surprisingly, the extra step of performing an
energy operation over a convolved image added little to the appearance of a
derived scene. Comparing the set of convolved images derived from scene A
with the corresponding set of energy images shows that, except for the
convolved/energy pair associated with the LL window, there was little visual
difference between the sets. The type of similarity that is observed between
the pair in figures 6 and 7 also exists in all the pairs except for the LL
pair. Thus, in screening out components from the experiment, one can
eliminate either the convolved image set or the energy image set without
sacrificing information. Except for the LL image, the convolved image set was
eliminated because of the desire to test the Laws texture measure, which is
essentially contained within the energy components. However, the convolved
image set would probably have yielded equivalent results and was a simpler
and less expensive measure.

The set of ratioed images tended to black out and white out many areas
(see figure 8). This property could have been eliminated by adjusting the
thresholds used in quantizing the ratioed data. However, since these images
did not appear to offer any information beyond that in the energy components,
they were eliminated from further study. Thus, out of the 47 derived images,
17 images remained.

Note  that  the  convolved  image  using  the  LL  window  is  a  blurred  version  of 
scene  A,  and  the  corresponding  energy  image  is  equivalent  to  performing  a 
standard  deviation  operation  over  the  image  (see  figures  4  and  5).  Therefore, 
this  pair  of  images  is  synonymous  with  the  Ad-Hoc  components  studied  in  one  of 
CSL's  previous  reports.12  From  a  visual  inspection  of  the  remaining  images, 
the  LL  image  pair  was  found  to  be  the  most  significant.  What  was 
disappointing  was  that  none  of  the  other  images,  except  for  perhaps  the  EE 
energy  image,  seemed  to  yield  any  additional  discriminatory  information.  The 
majority  of  them  were  mostly  noise  and  any  information  that  did  exist  seemed 
to  already  exist  on  the  LL  image.  Of  course,  this  judgment  was  made 
subjectively  on  a  visual  basis;  what  was  needed  was  some  quantitative  testing. 


12M.  Crombie,  N.  Friend  and  R.  Rand,  Feature  Component  Reduction  Through 
Divergence  Analysis.  U.S.  Army  Engineer  Topographic  Laboratories,  Fort  Belvoir 
VA,  ETL-0305,  October  1982,  AD-A123  474. 


Table  7.  Listing  of  class  results  for  principal  components 


Essentially,  the  trials  of  part  A  and  part  B  tested  various  combinations 
of  the  Laws  texture  components  with  the  two-component  Ad-Hoc  image 
descriptor.  In  part  A,  the  EE  Laws  component  was  added  to  the  Ad-Hoc 
descriptor  and  the  resulting  composite  image  was  classified  by  both  the  MAXLIK 
PM  and  the  CLUSTER  PM.  Here,  neither  one  of  the  classification  outputs  showed 
an  advantage  over  the  other.  Considering  the  increased  sophistication  of  the 
classification  procedure  over  that  of  previous  efforts,  plus  the  addition  of  a 
component,  the  results  of  part  A  were  disappointing.  The  confusion  between 
classes  found  in  the  previous  study  on  the  Ad-Hoc  measure  still  existed  in  the 
new  results;  in  fact,  the  four-class  experiment  done  with  one  component 
(extracted  from  a  3X3  window)  in  the  previous  study  did  as  well  as  the  trial 
in part A (see figure 5B, ETL-0305).

In  part  B,  the  use  of  principal  components  had  no  advantage  over  the 
original  components.  During  the  run  that  tested  two  (out  of  three)  principal 
components,  the  building/road  class  had  fewer  false  alarms  along  boundaries 
and  in  some  field  areas;  however,  the  hit  rate  (number  of  correct  hits) 
remained  about  the  same.  The  forest  and  field  classes  had  a  lower  hit  rate 
(see  figure  18). 

Surprisingly, the run that tested eight components did no better than the
trial in part A that used only three components.

As  expected,  the  quantitative  results  of  part  A  and  part  B  verified  the 
subjective  evaluation  made  at  the  beginning  of  the  experiment;  the  lack  of 
information  noticed  during  the  visual  inspection  of  the  images  materialized  in 
the  distance  measures  computed  by  CLASTAT  and  the  classification  results  of 
MAXLIK  and  CLUSTER.  Thus,  the  capability  to  display  descriptor  data  as  an 
image  gives  the  user  an  excellent  means  to  screen  data.  There  is  no  need  to 
go  through  tedious  data  analysis  when  a  quick  and  easy  display  of  an  image 
component  shows  that  component  to  be  predominantly  noise.  Such  a  display 
capability  is  very  suitable  to  an  interactive/semiautomated  system  of  feature 
extraction. 

The capability to display image descriptors also provides a good way to
explain why points were misclassified. A comparison of figures 4, 5, and 7
(images of the three descriptors) with the class map in figure 10 or figure 11
explains much of what went wrong in that classification. For example, the
confusion between the light area at the bottom right of the image and the
building/road class is due to the strong similarity between the corresponding
areas for the images of figures 4 and 7. Also, the reason for the difficulty
in creating separate classes for buildings and roads is easily seen. The
difference between these two classes is not in the statistics of the data, but
rather in structural information not incorporated into the algorithm. The
Laws texture measure was intended to encode such structural information, but
this attempt failed. The reason is that the measure is not robust; if the
window operators could be tuned to the size of a feature, then structural
information might be detected. However, this is impossible without ancillary
data about the image and a knowledge base defining the characteristics of the
feature.

1. The DIAL system has the capability to use derived image data from
operations such as convolutions and texture in an interactive statistical
classification process.

2.  The  capability  to  display  derived  data  as  images  is  an  excellent  means  of 
screening.  Components  that  lack  discriminatory  information  can  be  spotted 
visually  and  then  eliminated;  problems  in  classification  can  be  anticipated. 

3. A simple two- or three-component image descriptor will perform as
effectively as other more complex descriptors, such as LAWS or MAX-MIN
texture.

4.  The  effectiveness  of  statistical  classification  methods  and  texture 
analysis  is  limited. 

5.  Processes  that  can  guide  the  detection  of  textural  patterns  and  aid  in  the 
decision-making  process  are  needed. 

APPENDIX A. Laws Texture Data


Following a procedure suggested by Kenneth Laws, 15 component vectors
can be generated as texture data from gray-shade imagery. A three-step
procedure is advocated. The first step is to convolve the desired image
points with 16 different masks, resulting in a convolved image plane for each
mask. The set of masks is defined by the cross-product computations of four
five-component vectors:

L = ( 1   4   6   4   1)

E = (-1  -2   0   2   1)

S = (-1   0   2   0  -1)

R = ( 1  -4   6  -4   1)


The letters stand for level, edge, spot, and ripple. Multiplying one vector
by the transpose of another (or the same) vector produces the sixteen 5X5
windows. When a window is moved across all possible pixels on an MXN image, an
(M-4)X(N-4) convolved image results; 16 windows produce 16 convolutions.
Adding a border of two pixels to each side of the resultant image brings the
image back to its original size.
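
The construction of the sixteen windows can be sketched in Python using the standard Laws level, edge, spot, and ripple vectors. The naming of each mask by its two letters, with the first letter indexing rows and the second indexing columns, is an assumed convention.

```python
VECTORS = {
    "L": (1, 4, 6, 4, 1),      # level
    "E": (-1, -2, 0, 2, 1),    # edge
    "S": (-1, 0, 2, 0, -1),    # spot
    "R": (1, -4, 6, -4, 1),    # ripple
}

def outer(column, row):
    """Product of a column vector and a row vector: a 5 x 5 window."""
    return [[c * r for r in row] for c in column]

def laws_masks():
    """Return the sixteen 5 x 5 masks keyed by their two-letter names."""
    return {a + b: outer(VECTORS[a], VECTORS[b])
            for a in VECTORS for b in VECTORS}

masks = laws_masks()
```

Only the LL mask has a nonzero sum (256); every mask involving E, S, or R sums to zero and therefore responds only to local variation.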

In the second step, each point in the 16 convolved images is transformed
to a measure of texture energy by a moving window operation that computes the
standard deviation of the KXK points surrounding it (Laws used K=15). In the
third step, the texture energy planes are ratioed to the first plane. The
LL window is used for normalization since its standard deviation values will
be larger than those of any of the other 15 planes. Each of the other 15
planes is divided by the "LL" plane, resulting in 15 texture energy planes
that have values between 0 and 1.
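
The second and third steps can be sketched in Python as follows. The function names, the edge-replication fill, and the handling of zero energy in the reference plane (the ratio is set to 0) are assumptions; the quantization of the ratioed data to gray shades is not modeled.

```python
import numpy as np

def texture_energy(convolved, k=15):
    """Step 2: standard deviation of the k x k points around each pixel.
    Edge pixels are replicated so the plane keeps its size."""
    pad = k // 2
    padded = np.pad(np.asarray(convolved, dtype=float), pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return windows.std(axis=(2, 3))

def ratio_to_reference(energy, reference):
    """Step 3: divide an energy plane by the reference (LL) energy plane,
    leaving 0 wherever the reference energy is 0."""
    energy = np.asarray(energy, dtype=float)
    reference = np.asarray(reference, dtype=float)
    out = np.zeros_like(energy)
    np.divide(energy, reference, out=out, where=reference > 0)
    return out
```

In this sketch a plane whose local variation is half that of the LL plane gives a ratio of 0.5 everywhere.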


